INTRODUCTION


Long segments (> 1 megabase) of homozygous DNA are common in the genomes of outbred
human populations (1-5). Several lines of research indicate that elevated genome-wide
homozygosity (i.e. autozygosity) may increase breast cancer risk (6-9). For several cancer
types, small studies have found increased germline homozygosity at specific genomic locations,
suggesting that these regions harbor important cancer genes (10, 11). Homozygosity mapping is a
natural extension of large genome-wide association studies and has the potential to identify
novel breast cancer genes and provide biological insights. Based on this evidence, we
hypothesize that germline autozygosity is more common in breast cancer cases than in controls.
More specifically, we hypothesize that there are specific regions of the genome in which
homozygosity (i.e. “runs of homozygosity” (RoHs)) is more common in breast cancer cases
than in controls and that these regions contain breast cancer-related genes.


BODY 


A recent presentation prepared for the Era of Hope meeting in August 2011 (Orlando, FL)
provides an overview of the progress of this project to date. The presentation file is attached.
Slides of this presentation are referenced throughout this report.

Description  of  progress  towards  accomplishing  tasks  described  in  scope  of  work 
document: 

Task 1: Data acquisition and preparation (Slide 7): We have obtained all the genotype data
needed for this project and performed all standard quality control procedures (removal of
population outliers and samples of poor quality; SNP filtering based on call rates,
Hardy-Weinberg equilibrium, and allele frequency; assessment of population structure).

Task 2: Genome-wide autozygosity analysis (Slides 8-18): We have identified runs of
homozygosity (RoH) in our data using two different methods, as implemented in the Golden
Helix and PLINK programs, respectively. We have used these RoHs to derive genome-wide
measures of overall homozygosity. We have tested these measures for association with
case/control status and also performed sub-group analyses by estrogen receptor status.

Task 3: Autozygosity mapping analysis (Slides 19-22): Using the Golden Helix data on
genome-wide RoHs, we have identified 423 regions in which homozygosity is somewhat
common (>10 occurrences in our dataset).
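
The attached slides define a "cluster" of common RoHs as a continuous set of SNPs in which every SNP is covered by an RoH in at least 10 samples. A minimal sketch of that idea follows, assuming RoHs are given as inclusive (start, end) SNP-index pairs pooled across samples and that each sample contributes at most one RoH at any SNP. The function name, the input layout, and the example runs are hypothetical; this is not the Golden Helix implementation.

```python
from typing import List, Tuple


def common_roh_clusters(runs: List[Tuple[int, int]], n_snps: int,
                        min_samples: int = 10) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP-index ranges in which every SNP
    is covered by at least `min_samples` runs."""
    # Difference array: +1 where a run starts, -1 just after it ends.
    delta = [0] * (n_snps + 1)
    for start, end in runs:
        delta[start] += 1
        delta[end + 1] -= 1

    clusters = []
    depth = 0
    cluster_start = None
    for i in range(n_snps + 1):
        depth += delta[i]  # number of runs covering SNP i
        if depth >= min_samples and cluster_start is None:
            cluster_start = i
        elif depth < min_samples and cluster_start is not None:
            clusters.append((cluster_start, i - 1))
            cluster_start = None
    return clusters


# Hypothetical runs: two partly overlapping groups of 6, plus one group of 12.
example_runs = [(10, 50)] * 6 + [(30, 80)] * 6 + [(200, 260)] * 12
example_clusters = common_roh_clusters(example_runs, n_snps=300)
print(example_clusters)
```

In this toy input only the overlap of the first two groups (SNPs 30 to 50) and the third group reach a depth of 10, so two clusters are reported. The threshold is left as a parameter because the report text says ">10 occurrences" while the slides say "at least 10".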

Task 4: CGEMS analysis: RoH analysis in the publicly available CGEMS dataset has
recently been published by another group (12), and we have compared our findings to this work
(see below). Instead, we will follow up on our findings using an additional set of xxx early-onset
breast cancer cases and xxx controls typed on the Cyto12 300K SNP chip.

Task 5: DNA Sequencing: We are further evaluating our findings in order to determine a
promising region (or regions) for DNA sequencing.

Connection to Previous Findings (Slide 26): Based on our preliminary findings, we find no
overlap between our most significant RoH regions and the regions identified in a previous
analysis of RoHs in a case-control study of breast cancer. Because this prior study focused
primarily on ER+ breast cancer, we restricted this comparison to our ER+ results.

Future Work: Several additional analyses will allow us to further confirm our results. First, we
will conduct RoH mapping in an independent set of early-onset breast cancer cases and
controls. This dataset is of comparable size, but is typed on a different Illumina chip (Cyto12:
300,000 SNPs). We will also systematically compare our Golden Helix RoHs with our PLINK
RoHs to determine if the RoHs identified vary substantially by algorithm used. We will also
examine probe intensity data for RoH segments of interest to confirm that each RoH is due to
homozygosity and not a deletion. Finally, regarding our proposed sequencing of regions
identified in RoH analyses, we will perform such sequencing should a promising gene region
emerge from these analyses. At this point, we have not identified such a region.


KEY  RESEARCH  ACCOMPLISHMENTS 


•  Obtaining  and  performing  quality  control  procedures  on  GWAS  data 

•  Estimation  and  description  of  RoHs  estimated  from  GWAS  data. 

•  Test  of  runs  of  homozygosity  (both  genome-wide  RoH  levels  and  individual  loci)  for 
association  with  early-onset  breast  cancer 


REPORTABLE  OUTCOMES 


• An abstract describing this work has been accepted for platform and poster presentation
at the Era of Hope meeting to be held in August 2011 in Orlando, FL.

• This Award has supported the post-doctoral training that helped the P.I. receive several
job offers for tenure-track faculty positions.


CONCLUSION  (Slide  27): 

In this work, we find no evidence that overall RoH content or specific RoHs contribute to
early-onset breast cancer risk. However, our most promising RoHs will be followed up in an
independent sample of non-Hispanic white early-onset breast cancer cases and controls. We
found a suggestive association between total Mb of RoH and ER+ breast cancer, and this finding
will also have to be followed up in an independent sample.

The utility of RoH analysis for detection of cancer susceptibility loci appears to be somewhat
limited, at least in case-control GWA studies using standard SNP panels measured on a few
thousand individuals. The technique may be more appropriate in populations with higher RoH
content. However, with the availability of large GWAS datasets, associations for RoHs can
be easily assessed for other cancer phenotypes.
Runs of Homozygosity (RoH)


RoH: a long segment of consecutive homozygous
genotypes (~1 Mb)

- Suggesting the chromosome pair shares an identical segment

-  Relatively  common  in  human  genomes 

-  Distribution  of  RoH  frequency  and  size  varies  by  population 

RoHs  could  occur  due  to 

-  Related  parents  (close  or  distant)  (i.e.,  “autozygosity”  or  IBD) 

-  Natural  selection 

-  Hemizygosity  (e.g.  deletion) 

-  Uniparental  isodisomy 


RoHs  have  been  used  to  map  Mendelian  disease  genes 


Runs  of  Homozygosity  in  Humans 
Rationale for RoH as a breast cancer risk factor

• Some studies show consanguinity as a risk factor (humans and mice)

- Reflecting increased RoH content

• Loss of heterozygosity (LOH) is a common event in tumors

-  In  a  similar  fashion  could  RoH  influence  tumor  formation? 

•  RoH  could  harbor  recessive  susceptibility  variants 

-  Not  easily  detectable  in  GWAS  or  in  linkage  studies? 

-  GWAS  have  found  very  few  variants  with  clear  recessive  effects 

-  RoH  mapping  has  linked  recessive  loci  to  schizophrenia  risk 

•  RoHs  have  been  used  to  map  Mendelian  disease  genes 

•  Breast  cancer  susceptibility 

- Common variants (e.g., FGFR2), rare variants of strong effect (e.g.,
BRCA1/2), and variants of intermediate frequency (e.g., CHEK2)
Studies of RoH and Cancer Risk


• Initial Studies (positive findings)

- Bacolod et al. (2008): RoH more common in CRC cases (50K SNPs)

- Assie et al. (2008): RoH at specific loci more common for breast,
prostate and head/neck (345 micro-satellite markers)

• Colorectal cancer

- Spain et al. (2009): no replication of Bacolod et al. (550K SNPs)

•  Acute  lymphoblastic  leukemia  (Hosking,  2010) 

-  Homozygosity  unlikely  to  affect  risk  (292K  SNPs) 

•  Breast  and  prostate  cancer  (Enciso-Mora  2010) 

-  No  strong  evidence  that  homozygosity  increases  risk  (550K  SNPs) 

•  Early-onset  BrCa  not  yet  studied* 
Overall Goal


•  Determine  if  RoHs  are  related  to  early-onset  breast 
cancer  risk 

•  Hypotheses: 

- Overall germline RoH is more common in breast cancer cases
than controls

-  Homozygosity  at  specific  genomic  regions  is  more  common  in 
cases  than  controls 

•  Implications: 

-  Such  regions  are  likely  to  harbor  cancer-related  genes 
Early-onset breast cancer GWAS


3,203 non-Hispanic white participants

- 1,647 cases, 1,556 controls

- From BCFR (California, Ontario), Germany, Seattle, Long Island

- Known BRCA1 and BRCA2 carriers excluded

Typed for ~610,000 SNPs using Illumina 610-Quad chip
at Univ. of Chicago

Standard GWAS QC filters based on call rates (>0.97),
allele frequency (>0.05), and HWE (p>0.0001)

Principal components analysis for

- Exclusion of individuals of non-European ancestry

- Adjustment for variation in European ancestry


Methods  for  RoH  Detection 

Golden  Helix  RoH  module 

-  detects  RoHs  according  to: 

•  A  minimum  number  of  consecutive  homozygous  SNPs  (i.e., 
100  SNPs)  allowing  for  error  and 

•  A  minimum  physical  distance  (i.e.,  1  Mb) 

-  Creates  “clusters”  of  runs  for  association  testing 

• Defined as a continuous set of SNPs where each SNP has at
least 10 samples with an RoH within that set


PLINK RoH command

-  A  “sliding  window”  approach 
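
The consecutive-run criterion described above (a minimum number of consecutive homozygous SNPs plus a minimum physical length) can be sketched as follows. This is only an illustration under stated assumptions: the function name and the 0/1/2 genotype coding are hypothetical, the tolerance for genotyping error mentioned on the slide is omitted for brevity, and it is neither the Golden Helix module nor PLINK's sliding-window method.

```python
from typing import List, Tuple


def find_roh(genotypes: List[int], positions: List[int],
             min_snps: int = 100,
             min_length_bp: int = 1_000_000) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP-index pairs of qualifying runs.

    genotypes: 0 or 2 = homozygous, 1 = heterozygous (hypothetical coding).
    positions: base-pair position of each SNP, ascending, same length.
    """
    runs = []
    start = None
    # A trailing heterozygous sentinel closes a run still open at the end.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1:
            if start is None:
                start = i  # a new homozygous stretch begins here
        elif start is not None:
            end = i - 1
            n_snps_in_run = end - start + 1
            length_bp = positions[end] - positions[start]
            if n_snps_in_run >= min_snps and length_bp >= min_length_bp:
                runs.append((start, end))
            start = None
    return runs


# Toy chromosome with one SNP every 5 kb (the average density on the slides).
toy_positions = [i * 5_000 for i in range(600)]
toy_genotypes = [1] * 50 + [0] * 300 + [1] * 50 + [2] * 150 + [1] * 50
toy_runs = find_roh(toy_genotypes, toy_positions)
print(toy_runs)
```

In the toy data the 300-SNP stretch spans about 1.5 Mb and is kept, while the 150-SNP stretch passes the SNP-count threshold but spans under 1 Mb and is dropped, which is the purpose of the separate length criterion.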


Methods  for  RoH  Detection 


Full Panel of SNPs (525K post-QC SNPs)

- Avg. density = 1 SNP/5 kb

- 310K tagSNPs or “haplo-groups” (at r2 of 0.7)

• Reducing information by ~40%


We need to detect RoHs with high confidence!

- If SNPs were independent, a randomly generated RoH would
occur with probability:

• (1 - 0.34)^(#SNPs) x 525,000 SNPs x 3,203 participants

• Lencz, et al 2007

- For a 0.05 probability, a run of ~60 SNPs is required.

- Due to 40% reduction in info: ~100 SNPs is required (2% error
allowed)

Minimum length of 1 Mb

- To eliminate shorter RoHs in SNP-dense, high-LD areas
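
The arithmetic behind the ~60 and ~100 SNP thresholds can be checked with a short calculation. The sketch below uses the figures given on the slide (0.34, 525,000 SNPs, 3,203 participants, a 0.05 target, and a 40% reduction in information); the function names are hypothetical, and dividing by 0.6 is one assumed reading of how the 40% reduction was applied.

```python
import math


def chance_run_value(run_length: int, het_rate: float = 0.34,
                     n_markers: int = 525_000, n_people: int = 3_203) -> float:
    """(1 - het_rate)^run_length x markers x participants, as on the slide."""
    return (1 - het_rate) ** run_length * n_markers * n_people


def min_run_length(alpha: float = 0.05) -> int:
    """Smallest run length for which the slide's expression drops below alpha."""
    n = 1
    while chance_run_value(n) >= alpha:
        n += 1
    return n


min_independent = min_run_length()
# Assumed adjustment: only ~60% of SNPs carry independent information.
min_adjusted = math.ceil(min_independent / (1 - 0.40))
print(min_independent, min_adjusted)
```

Under these assumptions the threshold comes out at 59 SNPs for independent markers and 99 SNPs after the adjustment, in line with the rounded values of ~60 and ~100 quoted above.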


RoH  Detection  Results 


Total  RoH  segments  detected  across  all 
individuals: 

- Golden Helix method: 66,633 detected

- PLINK method: 68,335 detected

Mean length of RoH:

- GH: 1.44 Mb and 187 SNPs

- PLINK: 1.65 Mb and 194 SNPs

Number of common RoH clusters or pools

- GH: 428

- PLINK: N/A


Frequency of Common RoHs (n=423)


Length of “Common” RoHs (n=423)

•  19  RoH  >10Mb,  Including: 
Correlations b/t Consanguinity Measures
(excluding outliers)
Association analyses


Test  association  between  breast  cancer  status 
and 

-  Total  number  of  RoHs  per  individual 

-  Total  length  of  RoHs  per  individual 

Using  logistic  regression  adjusted  for 

-  Age  at  diagnosis/interview 

-  PCA-derived  ancestry  (5  PCs) 


Examine  by  ER  status 


Association between overall RoH and BrCa risk (1641 cases; 1554 controls)

                     OR     95% CI       P
Number of RoHs
  6-18               1.00   Ref
  19-21              0.81   0.68-0.99
  22-24              0.94   0.77-1.14
  >25                0.97   0.79-1.18
Total Length (Mb)
  7.5-24.1           1.00   Ref
  24.2-28.6          0.82   0.67-1.00    0.05
  28.7-33.3          0.80   0.66-0.98    0.03
  >33.5              0.96   0.79-1.17    0.69
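
As a consistency check on the table above, a two-sided Wald P-value can be back-calculated from an odds ratio and its 95% confidence interval. The sketch below does this for the total-length categories. It assumes the interval is symmetric on the log scale, as a logistic regression would produce; the function name is hypothetical, and this is a rough check rather than the adjusted analysis reported here.

```python
import math


def wald_p_from_ci(odds_ratio: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> float:
    """Approximate two-sided P-value implied by an OR and its 95% CI."""
    # Standard error of log(OR) recovered from the width of the log-scale CI.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


p_second = wald_p_from_ci(0.82, 0.67, 1.00)
p_third = wald_p_from_ci(0.80, 0.66, 0.98)
p_fourth = wald_p_from_ci(0.96, 0.79, 1.17)
print(round(p_second, 2), round(p_third, 2), round(p_fourth, 2))
```

The back-calculated values fall close to the 0.05, 0.03 and 0.69 listed with the table, which suggests those three figures are the P-values for the non-reference total-length categories.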
Association between overall RoH and ER+ risk (972 cases; 1554 controls)

                     OR     95% CI       P
Number of RoHs
  6-18               1.00   Ref
  19-21              0.85   0.68-1.05
  22-24              1.05   0.84-1.31
  >25                0.98   0.77-1.23
Total Length (Mb)
  7.5-24.1           1.00   Ref
  24.2-28.6          0.91   0.72-1.15    0.42
  28.7-33.3          0.87   0.69-1.10    0.25
  >33.5              1.00   0.80-1.27    0.94
Association between overall RoH and ER- risk (455 cases; 1554 controls)

                     OR     95% CI
Number of RoHs
  6-18               1.00   Ref
  19-21              0.79   0.60-1.05
  22-24              0.73
  >25
Total Length (Mb)
  28.7-33.3          0.68   0.51-0.91
                     0.83   0.62-1.12

P-values: 0.007, 0.01, 0.23
Locus-by-locus association analyses

Test  association  between  breast  cancer  and 

-  Binary  RoH  status  at  each  “common”  RoH 

-  “Percent  coverage”  of  total  length  of  RoH 

Using  logistic  regression  adjusted  for 

-  Age  at  diagnosis/interview 

-  PCA-derived  ancestry  (5  PCs) 

Examine  by  ER  status 

-  456  ER-  and  975  ER+ 


Association  Results 

(1641  cases;  1554  controls) 


•  P-values  do  not  strongly  deviate  from  their 
expected  distribution  under  the  null 

• Strongest associations observed on
chromosomes 11 (P=0.005) and 10 (P=0.005)
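
The statement that the P-values do not deviate from their null distribution can be illustrated with a simple calculation. The sketch below assumes that all 423 common RoH regions were tested and that the tests are independent with uniformly distributed P-values under the null; both are simplifying assumptions, and the function name is hypothetical.

```python
from typing import List


def expected_sorted_p(n_tests: int) -> List[float]:
    """Expected values of the ordered P-values under a uniform null."""
    return [i / (n_tests + 1) for i in range(1, n_tests + 1)]


n_regions = 423                 # common RoH regions reported in this work
observed_min_p = 0.005          # strongest association on this slide
expected_min_p = expected_sorted_p(n_regions)[0]   # 1 / 424
bonferroni_threshold = 0.05 / n_regions

print(f"expected smallest P under the null: {expected_min_p:.4f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.6f}")
print(f"observed smallest P: {observed_min_p}")
```

Under these assumptions the smallest of 423 null P-values is expected to be about 0.002, so an observed minimum of 0.005 is no more extreme than chance alone would produce and is well short of a Bonferroni-corrected threshold.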
Association Results
(942 ER+ cases)

• P-values do not strongly deviate from their
expected distribution under the null

• Strongest associations observed on
chromosomes 11 (P=0.002) and 1 (P=0.007)


[Figure: locus-by-locus association results; x-axis: Chromosome, y-axis: -Log10(P-value)]

Association  Results 
(455  ER-  cases) 

•  P-values  do  not  strongly  deviate  from  their 
expected  distribution  under  the  null 

• Strongest associations observed on
chromosomes 7 (P=0.001) and 4 (P=0.003)
Association Results (ER+)

                     OR     95% CI       P
Number of RoHs
  7-18               1.00   Ref
  19-21              1.11   0.90-1.37    0.35
  22-24              1.02   0.82-1.26    0.89
  >25                1.15   0.93-1.42    0.21
Total Length (Mb)
  8.8-27.9           1.00   Ref
  28.0-33.2          1.07   0.86-1.33    0.54
  33.3-39.2          1.17   0.94-1.46    0.17
  >39.3              1.21   0.97-1.51    0.09
Association Results (ER-)

                     OR     95% CI       P
Number of RoHs
  7-18               1.00   Ref
  19-21              0.92   0.71-1.76    0.49
  22-24              0.83   0.64-1.08    0.16
  >25                0.95   0.76-1.26    0.87
Total Length (Mb)
  8.8-27.9           1.00   Ref
  28.0-33.2          0.92   0.71-1.18    0.50
  33.3-39.2          1.04   0.80-1.34    0.77
  >39.3              0.99   0.76-1.29    0.94
Comparison with prior study of ER+ (top 6 signals)

Enciso-Mora          Pierce
10q21.2              11q21-11q22.1
5q15-5q21.2          11p14.3
6q22.31-6q22.33      1p31.1
3p22.2               4q13.3
3q21.2               22q13.1
1p31.3               Xp22.2


Conclusions 


•  Little  evidence  that  overall  RoH  content  or  specific 
RoHs  contribute  to  early-onset  breast  cancer  risk 

-  Suggestive  association  for  total  Mb  of  RoH  and  ER+ 

•  The  utility  of  RoH  mapping  for  detection  of  cancer 
susceptibility  loci  appears  to  be  somewhat  limited 

-  Underpowered  in  standard  GWA  studies? 

-  Appropriate  for  specific  populations? 

• However, with the availability of large GWAS datasets,
associations for RoHs can be assessed for other
cancer phenotypes